gh-124397: Add free-threading support for iterators. #148894

Open
rhettinger wants to merge 11 commits into python:main from rhettinger:iterator_synchronization

Conversation

Contributor

rhettinger commented Apr 22, 2026

rhettinger added the type-feature (A feature request or enhancement) label Apr 22, 2026
rhettinger changed the title from "Issue-124397: Add free-threading support for iterators." to "gh-124397: Add free-threading support for iterators." Apr 22, 2026
rhettinger marked this pull request as draft April 22, 2026 22:39
rhettinger requested a review from colesbury April 23, 2026 02:00
rhettinger force-pushed the iterator_synchronization branch from 21fca13 to 4c2bad0 April 23, 2026 02:33
rhettinger marked this pull request as ready for review April 23, 2026 02:52
Contributor

colesbury left a comment

LGTM

Comment thread Doc/library/threading.rst

import threading

source = range(5)
Contributor

This doesn't seem like a very compelling example, because you can already iterate over a range concurrently from multiple threads.

Maybe something like a simple generator would be more useful as an example?
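
To illustrate the suggestion, here is a minimal sketch of a generator-based example. A generator raises `ValueError: generator already executing` if two threads call `next()` on it at the same moment, so it needs external locking in a way that `range()` does not. `LockedIterator`, `squares`, and `consume_in_threads` are hypothetical names for this illustration; `LockedIterator` is only a stand-in for the wrapper proposed in this PR, not its actual code.

```python
import threading


class LockedIterator:
    """Hypothetical stand-in for the PR's wrapper: one lock around next()."""

    def __init__(self, iterable):
        self._iterator = iter(iterable)
        self._lock = threading.Lock()

    def __iter__(self):
        return self

    def __next__(self):
        # Only one thread at a time may advance the underlying iterator.
        with self._lock:
            return next(self._iterator)


def squares(n):
    """A simple generator; unsafe to advance from several threads at once."""
    for i in range(n):
        yield i * i


def consume_in_threads(shared, n_threads=4):
    """Drain one shared iterator from several threads and return sorted results."""
    results = []
    results_lock = threading.Lock()

    def worker():
        for value in shared:
            with results_lock:
                results.append(value)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(results)


collected = consume_in_threads(LockedIterator(squares(1000)))
```

Each value is delivered to exactly one thread, so the combined results contain every square once, in an order that depends on thread scheduling (hence the sort).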

Comment thread Doc/library/threading.rst

.. doctest::

import threading
Contributor

The example needs >>> prompts for doctest to run it. Alternatively, use .. code-block:: python

Comment thread Lib/threading.py Outdated
Comment on lines +863 to +864
self.iterator = iter(iterable)
self.lock = Lock()
Member

Please prefix .iterator and .lock with _ so that they aren't considered part of serialize's public API.

Contributor Author

Okay. Will do.

Comment thread Lib/threading.py
Member

Consider adding a test that an iterator __next__ raising an exception behaves properly.

Contributor Author

Done.
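
For readers following the thread, a check of this kind might look roughly like the sketch below. It is not the test added in the PR. `LockedIterator` and `FailsOnSecondCall` are hypothetical names, and the wrapper is only an assumed approximation of the one under review (an iterator plus a lock held in a `with` block).

```python
import threading


class LockedIterator:
    """Hypothetical approximation of the wrapper under review."""

    def __init__(self, iterable):
        self._iterator = iter(iterable)
        self._lock = threading.Lock()

    def __iter__(self):
        return self

    def __next__(self):
        # The with-block releases the lock even if next() raises.
        with self._lock:
            return next(self._iterator)


class FailsOnSecondCall:
    """Iterator whose second __next__ call raises, then keeps working."""

    def __init__(self):
        self.calls = 0

    def __iter__(self):
        return self

    def __next__(self):
        self.calls += 1
        if self.calls == 2:
            raise RuntimeError("boom")
        return self.calls


it = LockedIterator(FailsOnSecondCall())
first = next(it)

# The exception from the wrapped __next__ should reach the caller unchanged.
try:
    next(it)
except RuntimeError as exc:
    error = exc
else:
    error = None

# The lock must not be left held, or every later next() would deadlock.
lock_released = it._lock.acquire(blocking=False)
if lock_released:
    it._lock.release()

third = next(it)
```

The two properties worth asserting are that the original exception propagates and that the wrapper remains usable afterwards.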

Comment thread Lib/threading.py

## Synchronization tools for iterators #####################

class serialize:
Contributor

Should this be a subclass of Iterable? (suggested by @serhiy-storchaka on the issue)

Contributor Author

I don't think so. We don't do that for most iterables. Also, it isn't necessary. The ABC doesn't add any mixin methods and isinstance(serialized_iterator, Iterable) already returns True.
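
The `isinstance` point can be checked with a small sketch. `PlainWrapper` is a hypothetical class that does not inherit from any ABC; the `collections.abc` classes recognize it structurally because it defines `__iter__` (and `__next__`).

```python
from collections.abc import Iterable, Iterator


class PlainWrapper:
    """Hypothetical iterator wrapper with no ABC in its bases."""

    def __init__(self, iterable):
        self._iterator = iter(iterable)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._iterator)


wrapped = PlainWrapper([1, 2, 3])

# Both checks succeed through the ABCs' __subclasshook__, without subclassing.
is_iterable = isinstance(wrapped, Iterable)
is_iterator = isinstance(wrapped, Iterator)
has_abc_base = Iterable in PlainWrapper.__mro__
```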

Comment thread Doc/library/threading.rst
with threading_helper.wait_threads_exit():
    for worker in workers:
        worker.start()
    start.set()
Contributor

This doesn't ensure all threads are started before they are released from wait(). A barrier, rather than an event, should be preferred in these scenarios.
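
A minimal sketch of the barrier approach suggested here, using only the standard `threading.Barrier`. The function and variable names are hypothetical and this is not the code from the PR. With an event, a thread that starts late simply finds the event already set and runs alone; with a barrier, no thread proceeds until all of them have arrived.

```python
import threading


def run_with_barrier(n_threads=4):
    """Start n_threads workers that all pause at a barrier before working."""
    barrier = threading.Barrier(n_threads)
    started = []          # one entry per thread that has begun running
    seen_at_release = []  # how many threads had started when each was released

    def worker():
        started.append(threading.get_ident())
        barrier.wait()    # blocks until n_threads threads have called wait()
        seen_at_release.append(len(started))

    workers = [threading.Thread(target=worker) for _ in range(n_threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return seen_at_release


counts = run_with_barrier(4)
```

Every worker observes that all four threads had started by the time it was released, which is the guarantee an event alone does not give.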


read-the-docs-community Bot commented Apr 28, 2026

@vstinner
Member

I concur with @ZeroIntensity who wrote that threading is a bad place for this. I'm not sure that threading is a good home for the 3 added functions (serialize, synchronized, concurrent_tee). IMO itertools would be a better home for them.

For example, I would expect @threading.synchronized (which sounds like synchronized in Java) to work on any function, such as:

import threading

class Demo:
    _next_id = 0

    @threading.synchronized
    def get_id(self):
        self._next_id += 1
        return self._next_id

But calling get_id() fails with:

Traceback (most recent call last):
  File "/home/vstinner/python/main/x.py", line 12, in <module>
    print(demo.get_id())
          ~~~~~~~~~~~^^
  File "/home/vstinner/python/main/Lib/threading.py", line 930, in inner
    return serialize(iterator)
  File "/home/vstinner/python/main/Lib/threading.py", line 869, in __init__
    self._iterator = iter(iterable)
                     ~~~~^^^^^^^^^^
TypeError: 'int' object is not iterable

concurrent_tee() sounds very specific, whereas threading currently contains only very generic tooling for writing multithreaded code. itertools contains many functions, so should we expect more itertools-style functions to pop up in threading? As a user, I find it awkward to choose between threading.concurrent_tee() and itertools.tee(), which live in two different modules. I would expect them to be in the same module, and I would expect the itertools.tee() documentation to mention concurrent_tee() and explain when it should be used instead.
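
For context on the expectation described above, a general-purpose `synchronized` decorator, one that works on any function, could be sketched as follows. This is a hypothetical illustration, not the `threading.synchronized` proposed in this PR (which, per the traceback, wraps the function's return value in `serialize`). It also differs from Java, where `synchronized` methods lock per instance; this sketch assumes one lock per decorated function.

```python
import functools
import threading


def synchronized(func):
    """Hypothetical decorator: serialize all calls to func with one lock."""
    lock = threading.RLock()

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        with lock:
            return func(*args, **kwargs)

    return wrapper


class Demo:
    _next_id = 0

    @synchronized
    def get_id(self):
        # Read-modify-write is safe here because the decorator holds the lock.
        self._next_id += 1
        return self._next_id


demo = Demo()
ids = []


def worker():
    for _ in range(1000):
        ids.append(demo.get_id())


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With the lock in place, the four threads receive 4000 distinct, consecutive IDs between them.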
